The weakest pre-expectation calculus [20] has proved to be a mature theory for analyzing quantitative
properties of probabilistic and nondeterministic programs. We present an automatic method
for proving quantitative linear properties on any denumerable state space using iterative backwards
fixed point calculation in the general framework of abstract interpretation. To accomplish
this task we present the technique of random variable abstraction (RVA), and we also postulate a
sufficient condition to achieve exact fixed point computation in the abstract domain. The feasibility of
our approach is shown with two examples, one obtaining the expected running time of a probabilistic
program, and the other the expected gain of a gambling strategy.

Our method works on general guarded probabilistic and nondeterministic transition systems
instead of plain pGCL programs, allowing us to easily model a wide range of systems, including
distributed ones and unstructured programs. We present the operational and wp semantics for these
programs and prove their equivalence.

1 Introduction 

Automatic probabilistic program verification has been a field of active research in the last two decades.
The two major approaches to tackle this problem have been model checking [21, 9, 2, 11] and theorem
proving [12, 3]. Traditionally, model checking has been targeted to be a push-button technique, but in the
quest for full automation some restrictions apply. The most prominent one is finiteness of the state space,
which leads to finite-state Markov chains (MC) and Markov decision processes (MDP). These models are
usually verified against probabilistic extensions of temporal logics such as PCTL [9, 2]. Even with the
finiteness restriction, the state explosion problem has to be alleviated using, for example, partial order
reduction techniques HJ. Another approach is taken in the PASS tool [23, 10], where the authors profit
from the work on predicate abstraction [8, 4] in the general framework of abstract interpretation [5] in
order to aggregate states conveniently and prevent the state explosion problem or even handle infinite
state spaces. Theorem proving techniques can also overcome the state explosion and infinite state space
problems, but at a cost: the automation is up to loop invariants, that is, once the correct loop invariants
are fixed, the theorem prover can automatically prove the desired property [12]. There are also mixed
approaches like ifTTI in the realm of refinement checking of pGCL programs. 

Automatic probabilistic program verification cannot, to the authors' knowledge, tackle the problem
of performance measurement over possibly unbounded program variables. Typical examples of such
quantitative measurements are the expected number of rounds of a probabilistic algorithm, or the expected
revenue of a gambling strategy. In this kind of problem, the model checking approach discretizes the variable
up to a bound, either statically or using counterexample-guided abstraction refinement (CEGAR) [10],
but since the values it can reach are unbounded, an approximation is needed at some point. In theorem
proving this problem could be solved, but the invariants have to be devised, and anyone involved in this task
knows that this is far from trivial.

We propose the computation of parametric linear invariants on a set of predefined predicates, and we
call this random variable abstraction (RVA), as it generalizes predicate abstraction (PA) to a quantitative
logic based on expectations [20]. Our motivation is simple. If the underlying invariants of the program
can be captured as a sum of linear random variables in disjoint regions of the state space, a suitable fixed
point computation should be able to discover the coefficients of such linear expressions. To compute the
coefficients we depart from the traditional forward semantics approach and use weakest pre-expectation
(wp) backwards semantics [20]. This combination of predicate abstraction, linear random variables
and quantitative wp renders a new technique that can compute, for example, the expected running time of
probabilistic algorithms, even though these quantities are not bounded a priori by any constant. This allows
us to reason about the efficiency of algorithms.

Parametric linear invariants for probabilistic and nondeterministic programs using expectation logic
are also generated in [14], but our work differs in two aspects. First, in [14] all random variables have
to be one-bounded or, equivalently, bounded by a constant. Second, to compute the coefficients,
they cast the problem as one of constraint solving and resort to off-the-shelf
constraint solvers, while we use a fixed point computation.

Outline. In Section 2 we first give a motivating example that briefly shows the type of problems we
address as well as the mechanics to obtain the result. Then, in Section 3, the probabilistic and
nondeterministic programs and their operational and wp semantics are presented, as well as the fixed point
computation in the concrete domain of (general) random variables. Section 4 covers the abstract domain of
random variables, its semantics, and the fixed point computation, as well as the general problems faced in
this process. A more involved example is given in Section 5. Section 6 concludes the paper and discusses
future work.

2 Motivating Example 

In order to give some intuition on what we are going to develop throughout the paper, we present a purely
probabilistic program $P_0$ that generates a geometric distribution (Figure 1). This program halts with
probability 1, and variable $i$ could reach any natural number with positive probability.

$P_0 \triangleq x, i := 1, 0$
    $;\ \mathbf{do}\ x \neq 0 \to$
        $x := 0 \ \oplus_{1/2}\ x := 1$
        $;\ i := i + 1$
    $\mathbf{od}$

Figure 1: Geometric distribution.

The semantics of this program can be regarded as a probability measure $\Delta$ over the naturals with
distribution $\Delta.\{K < i\} = \frac{1}{2^K}$. The expected running time of the algorithm is precisely the
expectation of random variable $i$ with respect to the measure $\Delta$, that is $\int_\Delta i$. Informally
we could do this quantitative analysis as follows: the first iteration is certain, the second one
occurs with probability one half, the third one occurs with half the probability of the previous one, and so
on. In [20, p. 69] the same calculation is done by hand, using the invariant $(i + K)$ if $(x \neq 0)$ else $i$;
the equations fix $K = 2$, and this implies that, given the initialization, the
expected value of random variable $i$ is 2.

Our approach establishes a recursive higher-order function $f$ on the domain of random variables
using weakest pre-expectation semantics [20]. It computes the expectation of either taking a loop cycle
and repeating, or ending.

$f.X = [x \neq 0] \times wp.(x := 0 \oplus_{1/2} x := 1 ;\ i := i + 1).X + [x = 0] \times i$

We are looking for a fixed point of this functional, namely $f.\varphi = \varphi$, but taking a specific subset of
random variables: two linear functions on the post-expectation $i$, one for each region defined by the
predicates that control the flow of the program. It is a parametric random variable with coefficients
$((a_1, a_0), (b_1, b_0))$.

$Inv = (a_1 i + a_0)[x = 0] + (b_1 i + b_0)[x \neq 0]$

Calculating $f$ on the random variable $Inv$:

$f.Inv = (1 i + 0)[x = 0] + \left(\frac{a_1 + b_1}{2}\, i + \frac{a_0 + a_1 + b_0 + b_1}{2}\right)[x \neq 0]$ .

If we only look at the coefficients, $f$ can be (exactly) captured by a function $f^\sharp$ on the domain
$\mathbb{R}^{(1+1) \times 2}$.

$f^\sharp.((a_1, a_0), (b_1, b_0)) = \left((1, 0), \left(\frac{a_1 + b_1}{2}, \frac{a_0 + a_1 + b_0 + b_1}{2}\right)\right)$ .

The computation goes as follows: starting from the random variable that is constantly 0, represented by the
tuple $((0, 0), (0, 0))$, we apply the update rule iteratively until the floating point precision produces a
fixed point, after a little less than fifty iterations (Table 1).





              Coefficients
Iter.   $(a_1, a_0)$   $(b_1, b_0)$
0       (1, 0) is not yet reached: (0, 0)   (0, 0)
1       (1, 0)         (0, 0)
2       (1, 0)         (0.5, 0.5)
3       (1, 0)         (0.75, 1)
4       (1, 0)         (0.875, 1.375)
5       (1, 0)         (0.9375, 1.625)
6       (1, 0)         (0.96875, 1.78125)
7       (1, 0)         (0.984375, 1.875)
...     ...            ...
45      (1, 0)         (1, 2)

Table 1: Iteration for the geometric distribution program.
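The iteration of Table 1 can be reproduced with a few lines of code. The following is a minimal sketch, assuming the update rule on coefficients stated above; the names `f_sharp` and `iterate_to_fixed_point` are ours and purely illustrative. The number of iterations needed depends on the floating point format, so the sketch does not fix it.

```python
def f_sharp(coeffs):
    """One application of the coefficient update rule of Section 2.

    coeffs = ((a1, a0), (b1, b0)) parametrizes
    Inv = (a1*i + a0)[x = 0] + (b1*i + b0)[x != 0].
    """
    (a1, a0), (b1, b0) = coeffs
    return ((1.0, 0.0), ((a1 + b1) / 2, (a0 + a1 + b0 + b1) / 2))


def iterate_to_fixed_point(step, start, max_iter=1000):
    """Apply `step` from `start` until the floating point value stops changing."""
    current = start
    for n in range(max_iter):
        nxt = step(current)
        if nxt == current:
            return current, n
        current = nxt
    raise RuntimeError("no floating point fixed point within max_iter")


# Start from the random variable that is constantly 0.
coeffs, iterations = iterate_to_fixed_point(f_sharp, ((0.0, 0.0), (0.0, 0.0)))
(a1, a0), (b1, b0) = coeffs

# Pre-expectation of the initialization x, i := 1, 0 (the x != 0 region, i = 0).
expected_i = b1 * 0 + b0
```

Here `expected_i` plays the role of the value obtained by syntactic substitution of the initialization into the invariant.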

These fixed point coefficients represent the random variable $i \times [x = 0] + (i + 2) \times [x \neq 0]$, which is a
valid pre-expectation of the repeating construct with respect to post-expectation $i$. If we take the weakest
pre-expectation of the initialization (syntactic substitution) we obtain the value 2. Note that the invariant and
the expectation of random variable $i$ coincide with the calculations done by hand in [20].
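As a check, the fixed point property can also be verified by hand. The following worked calculation is a sketch that uses only the wp rules for assignment, sequencing and probabilistic choice; $\varphi$ is our name for the random variable read off the last row of Table 1.

```latex
\begin{align*}
\varphi &= i \times [x = 0] + (i + 2) \times [x \neq 0] \\
wp.(x := 0 \oplus_{1/2} x := 1 ;\ i := i + 1).\varphi
  &= \tfrac{1}{2}\,(i + 1) + \tfrac{1}{2}\,((i + 1) + 2) = i + 2 \\
f.\varphi &= [x \neq 0] \times (i + 2) + [x = 0] \times i = \varphi \\
wp.(x, i := 1, 0).\varphi &= 0 + 2 = 2
\end{align*}
```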
3 Concrete Semantics

3.1 Probabilistic Programs 

We fix a finite set of variables $X$. A state is a function from variables to the semantic domain, $s : X \to \Omega$,
and the set of all states is denoted by $S$. An expression is a function from states to the semantic domain,
$\beta : S \to \Omega$, while a boolean expression is a function $G : S \to \{true, false\}$, and an assignment
$E : S \to S$ is a state transformer. We define the substitution of expression $\beta$ with assignment $E$, written
$\beta\langle E \rangle$, as the function composition $\beta \circ E$.

A guarded command $(G \to E_1 @ p_1 \mid \cdots \mid E_k @ p_k)$ consists of a boolean guard $G$ and assignments
$E_1, \ldots, E_k$ weighted with probabilities $p_1, \ldots, p_k$ where $\sum_{i=1}^{k} p_i \leq 1$.

A program $P = (S, \mathcal{I}, C)$ consists of a boolean expression $\mathcal{I}$ that defines the set of initial states and a
finite set of guarded commands $C$.

Note that using these guarded commands we can underspecify (in the sense of refinement) in two ways.
One way is using guard overlapping, so in the non-null intersection of $G_0$ and $G_1$ there is a
nondeterministic choice between the two multi-way probabilistic assignments. The other is using subprobability
distributions, since the expression (Eq @ \ \E\ @ \ ) can be regarded as all the probability distributions that 
assign at least ^ to each branch [7]. Later, in the operational model, we will see that the first one is 
handled by convex combination closure, while the latter is handled by up closure.

3.2 Operational model 

We fix the semantic domain $\Omega$ as a denumerable set, for example the rationals. The power set is denoted
by $\mathbb{P}$, and if the empty set is not included we denote it by $\mathbb{P}^+$.

The set of (discrete) sub-probability measures over $S$ is $\overline{S} = \{\Delta : S \to [0..1] \mid \sum \Delta \leq 1\}$. The partial
order $\Delta \sqsubseteq \Delta'$ on $\overline{S}$ is defined pointwise, and $0$ denotes its least element, the measure that is 0
everywhere. A set of sub-probability measures $\xi \subseteq \overline{S}$ is up-closed if $\Delta \in \xi$ and $\Delta \sqsubseteq \Delta'$ imply
$\Delta' \in \xi$. Similarly, $\xi$ is convex closed if for all $p \in [0..1]$, $\Delta_0, \Delta_1 \in \xi$ implies
$p\Delta_0 + (1 - p)\Delta_1 \in \xi$. Finally, $\xi$ is Cauchy-closed, or simply closed, if it contains its boundary, that is,
for every sequence $\{\Delta_i\} \subseteq \xi$ such that $\Delta_i \to \Delta$, we have $\Delta \in \xi$.
$\mathbb{C}S$ is the set of non-empty, up-closed, convex and Cauchy-closed subsets of $\overline{S}$. The set of probabilistic
programs over $S$ is defined as $\mathbb{H}S = S \to \mathbb{C}S$, and it is ordered pointwise with $\sqsubseteq$.

The probabilistic and nondeterministic transition system $\mathcal{M}$ is a tuple $(S, \mathcal{I}, T)$ where $S$ is the set of
states, $\mathcal{I} \subseteq S$ is a set of initial states, and $T \in \mathbb{H}S$ is called the transition function.

The semantics of a program $P = (S, \mathcal{I}, C)$ is the tuple $\mathcal{M} = (S, \mathcal{I}, T)$ with the same state space and
initial states, where $T.s$ is defined by its generators.

Definition 3.1. Given the program $P = (S, \mathcal{I}, C)$, where
$C = \{i : [0..l) \bullet (G_i \to E^i_1 @ p^i_1 \mid \cdots \mid E^i_{k_i} @ p^i_{k_i})\}$, the
set of generators of $T.s$ is $GT.s = \{i : [0..l) \mid G_i.s \bullet \Delta_i\}$ if $(\exists i : [0..l) \bullet G_i.s)$, otherwise $GT.s = \{0\}$, where
$\Delta_i.s' = (\sum j : [1..k_i] \mid E^i_j.s = s' \bullet p^i_j)$.

The set $GT.s$ is nonempty and closed. Now $T.s$ can be defined from above as the minimum set in
$\mathbb{C}S$ that includes $GT.s$. Note that this set is well defined since $GT.s \subseteq \overline{S} \in \mathbb{C}S$, and $\mathbb{C}S$ is closed under arbitrary
intersections. We can also define $T.s$ from below, taking the up and convex closure of the generator set
$GT.s$. There is no need to take the Cauchy closure since the generator set is closed and the operators that
form the up and convex closure preserve this property.
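The generator set of Definition 3.1 can be computed directly for a given state. The sketch below is illustrative only: it assumes states are hashable tuples, guarded commands are pairs of a guard and a list of (assignment, probability) branches, and the zero measure is represented by an empty dictionary. The name `generators` and this encoding are ours, not part of the formal development.

```python
from fractions import Fraction


def generators(commands, state):
    """Return the generators GT.s as a list of sub-probability measures.

    Each measure is a dict mapping successor states to probabilities.
    """
    measures = []
    for guard, branches in commands:
        if guard(state):
            delta = {}
            for assign, prob in branches:
                target = assign(state)
                # Branches that reach the same successor add up their weights.
                delta[target] = delta.get(target, Fraction(0)) + prob
            measures.append(delta)
    # If no guard holds, the only generator is the zero measure.
    return measures if measures else [{}]


# Loop of the geometric program as one guarded command; a state is (x, i).
half = Fraction(1, 2)
geometric = [
    (lambda s: s[0] != 0,
     [(lambda s: (0, s[1] + 1), half),
      (lambda s: (1, s[1] + 1), half)]),
]

running = generators(geometric, (1, 0))
halted = generators(geometric, (0, 3))
```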
3.3 Expectation-transformer semantics

In the previous section a probabilistic and nondeterministic program semantics was defined as a transition 
function from states to a subset of subprobability distributions. This set was defined to be up and convex 
closed. 

Verification of this particular type of programs poses the question of what propositions are in this
context. We take the approach by Kozen [15], where propositions $Q : S \to \{true, false\}$ are generalized
to random variables, that is, measurable functions $\beta : S \to \mathbb{R}_{\geq}$ from states to the non-negative reals. The program
$P$ is taken as a distribution function $\Delta$, and the program verification problem, formerly a boolean value
$P \models Q$, is cast into an integral $\int_\Delta \beta$, which is the expectation of random variable $\beta$ given distribution $\Delta$.
In a nutshell, this is a quantitative logic based on expectations. Boolean operators are (not uniquely)
generalized. The arithmetic counterpart of implication is $\alpha \Rrightarrow \beta$, that is, pointwise inequality
$(\forall s : S \bullet \alpha.s \leq \beta.s)$. Conjunction $\wedge$ can be taken as multiplication $\times$ or minimum $\sqcap$, and disjunction $\vee$ over
non-overlapping predicates is captured by addition $+$. The characteristic function operator $[\cdot]$ goes from
the boolean to the arithmetic domain with the usual interpretation $[true] = 1$, $[false] = 0$, and using it we
can define negation as subtracting the truth value from one: $[\neg Q] = 1 - [Q]$.

This model generalizes the deterministic one since it is proven that $[P \models Q] = \int_P [Q]$ for predicate $Q$
and deterministic program $P$.

In [13], He et al. extend Kozen's work to deal with nondeterminism as well as probabilism in
programs. The program verification problem is again a real value, namely the least expected value of
the random variable with respect to the sub-probability distributions generated by the program (demonic
nondeterminism).

The probabilistic (and nondeterministic) program verification problem is usually defined on the program
text itself, as Hoare triple semantics or the equivalent weakest precondition calculus of Dijkstra [15, 20].
The triple $\{Q\}\,P\,\{Q'\} \equiv Q \Rightarrow wp.P.Q'$ is generalized to $\{\alpha\}\,P\,\{\beta\} \equiv \alpha \Rrightarrow wp.P.\beta$, and the generalized wp
is called weakest pre-expectation calculus or probabilistic wp semantics.

In general, the program outcome depends on the input state $s$, therefore $wp.P.\beta$ is a function that,
given the initial state, establishes a lower bound on the expectation of random variable $\beta$ (it is exact in
the case of a purely probabilistic program). Therefore, expectations as functions of the input are also
random variables, hence we define the expectation space as $\mathbb{E}S = (S \to \mathbb{R}_{\geq}, \Rrightarrow)$. It is worth noticing that
all functions in $\mathbb{E}S$ are measurable since $S$ is taken to be denumerable (all subsets of $S$ are measurable).
This implies that it is valid to write integrals like $\int_\Delta f$ with $\Delta \in \overline{S}$ and $f \in \mathbb{E}S$. We define the expectation
transformer space as the functions from expectations to expectations, $\mathbb{T}S = (\mathbb{E}S \leftarrow \mathbb{E}S, \sqsubseteq)$.

The wp expectation calculus is defined structurally on the constructors of probabilistic and
nondeterministic programs (Fig. 2). This function basically maps assignments to substitutions, choice and
probabilistic choice to convex combination, and nondeterministic choice to minimum.

Inspired by this semantics, we are going to define a weakest pre-expectation for our program
$P = (S, \mathcal{I}, C)$. We first transform the set of commands $C$ into a semantically equivalent set with pairwise
disjoint guards. Let $C = \{i : [0..l) \bullet (G_i \to E^i_1 @ p^i_1 \mid \cdots \mid E^i_{k_i} @ p^i_{k_i})\}$. We follow the standard approach of
taking the complete boolean algebra [6] over the finite set of assertions $\{G_i\}$. The new program $P'$ is:

$P' = \{I : \mathbb{P}^+[0..l),\ i : I \bullet (A_I \to E^i_1 @ p^i_1 \mid \cdots \mid E^i_{k_i} @ p^i_{k_i})\}$

where

$A_I = (\bigwedge i : I \bullet G_i) \wedge (\bigwedge i : [0..l) - I \bullet \neg G_i)$ .   (1)

This program can be regarded as a pGCL program that consists of an if-else ladder over the atoms $A_I$. In
each branch $I$, there is an $|I|$-way nondeterministic choice among $k_i$-way probabilistic choices with $i \in I$. The
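$wp.\mathbf{abort}.\beta \ \triangleq\ 0$
$wp.\mathbf{skip}.\beta \ \triangleq\ \beta$
$wp.E.\beta \ \triangleq\ \beta\langle E \rangle$
$wp.(P_0 ; P_1).\beta \ \triangleq\ wp.P_0.(wp.P_1.\beta)$
$wp.(P_0 \oplus_p P_1).\beta \ \triangleq\ p \times wp.P_0.\beta + (1 - p) \times wp.P_1.\beta$
$wp.(\mathbf{if}\ G\ \mathbf{then}\ P_0\ \mathbf{else}\ P_1\ \mathbf{fi}).\beta \ \triangleq\ wp.(P_0 \oplus_{[G]} P_1).\beta$
$wp.(P_0 \sqcap P_1).\beta \ \triangleq\ wp.P_0.\beta \sqcap wp.P_1.\beta$
$wp.(\mathbf{do}\ G \to P\ \mathbf{od}).\beta \ \triangleq\ (\mu X \bullet [G] \times wp.P.X + [\neg G] \times \beta)$

Figure 2: wp expectation calculus.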
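The loop-free rules of the wp expectation calculus translate almost literally into a recursive evaluator. The following is a minimal sketch under our own assumptions: programs are encoded as tagged tuples, states as dictionaries, and expectations as Python functions from states to numbers. The constructor tags (`"assign"`, `"seq"`, `"pchoice"`, `"if"`, `"nchoice"`) are hypothetical names chosen for illustration, and the loop constructor is deliberately left out since it requires a fixed point computation.

```python
from fractions import Fraction


def wp(prog, beta):
    """Weakest pre-expectation of a loop-free program w.r.t. post-expectation beta."""
    kind = prog[0]
    if kind == "abort":
        return lambda s: Fraction(0)
    if kind == "skip":
        return beta
    if kind == "assign":                  # ("assign", E), E maps a state to a state
        E = prog[1]
        return lambda s: beta(E(s))       # substitution: beta composed with E
    if kind == "seq":                     # ("seq", P0, P1)
        return wp(prog[1], wp(prog[2], beta))
    if kind == "pchoice":                 # ("pchoice", p, P0, P1)
        p, w0, w1 = prog[1], wp(prog[2], beta), wp(prog[3], beta)
        return lambda s: p * w0(s) + (1 - p) * w1(s)
    if kind == "if":                      # ("if", G, P0, P1)
        G, w0, w1 = prog[1], wp(prog[2], beta), wp(prog[3], beta)
        return lambda s: w0(s) if G(s) else w1(s)
    if kind == "nchoice":                 # ("nchoice", P0, P1), demonic choice
        w0, w1 = wp(prog[1], beta), wp(prog[2], beta)
        return lambda s: min(w0(s), w1(s))
    raise ValueError(f"unknown constructor: {kind}")


# Loop body of the geometric program: (x := 0 [1/2] x := 1); i := i + 1
body = ("seq",
        ("pchoice", Fraction(1, 2),
         ("assign", lambda s: {**s, "x": 0}),
         ("assign", lambda s: {**s, "x": 1})),
        ("assign", lambda s: {**s, "i": s["i"] + 1}))

# Invariant found in Section 2: i if x = 0, and i + 2 otherwise.
inv = lambda s: s["i"] if s["x"] == 0 else s["i"] + 2

result = wp(body, inv)({"x": 1, "i": 0})
```

Evaluating the body against the invariant of Section 2 at the initial state gives back the value of the invariant there, as expected of a fixed point.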

wp semantics is defined as follows:

$wp.P'.\beta \triangleq (\sum I : \mathbb{P}^+[0..l) \bullet [A_I] \times (\sqcap\, i : I \bullet (\sum j : [1..k_i] \bullet p^i_j \times \beta\langle E^i_j \rangle)))$ .   (2)

3.4 Relational to expectation-transformer embedding

We now define the notion of wp in terms of the transition function $T$. The injection $wp \in \mathbb{H}S \to \mathbb{T}S$ is defined
as the minimum expectation of random variable $\beta$ over all nondeterministic choices of sub-probability
distributions.

$wp.T.\beta.s \triangleq (\sqcap\, \Delta : T.s \bullet \int_\Delta \beta)$ .   (3)

The next lemma shows that the syntactic definition (2), inspired by the pGCL wp calculus, coincides with the
semantic one (3). From now on we will use both definitions interchangeably.

Lemma 3.2. The syntactic and semantic notions of weakest pre-expectation coincide, that is
$wp.P'.\beta.s = wp.T.\beta.s$.

Proof. First suppose there is a valid $G_i.s$. We begin with the rhs of the equality, which by (3) is
$(\sqcap\, \Delta : T.s \bullet \int_\Delta \beta)$. The equality will first be proved for the generators $GT.s \subseteq T.s$, where
$GT.s = \{i : [0..l) \mid G_i.s \bullet \Delta_i\}$ and $\Delta_i.s' = (\sum j : [1..k_i] \mid E^i_j.s = s' \bullet p^i_j)$. Let $I = \{i : [0..l) \mid G_i.s\}$ be the index set
of valid guards; then the minimum over all expectations is $(\sqcap\, i : I \bullet \int_{\Delta_i} \beta)$. Now we develop the inner term
using the fact that $\Delta_i \neq 0$ on a finite set of points.

$\int_{\Delta_i} \beta$
$= (\sum s' : S \bullet \beta.s' \times (\sum j : [1..k_i] \mid E^i_j.s = s' \bullet p^i_j))$   { definition of $\Delta_i$ }
$= (\sum s' : S,\ j : [1..k_i] \mid E^i_j.s = s' \bullet \beta.s' \times p^i_j)$   { distributivity }
$= (\sum j : [1..k_i] \bullet p^i_j \times \beta.(E^i_j.s))$   { $s'$ is a function of $j$ }
$= (\sum j : [1..k_i] \bullet p^i_j \times (\beta\langle E^i_j \rangle).s)$   { definition of substitution }

Summing up we have

$wp.T.\beta.s = (\sqcap\, i : I \bullet (\sum j : [1..k_i] \bullet p^i_j \times (\beta\langle E^i_j \rangle).s))$ .

Taking the expression $wp.P'.\beta$ in (2) and evaluating it in $s$ we get the same expression, since only $A_I$ is
valid.

The other case is given when all guards are invalid in $s$. The rhs of the equation is 0 since $0 \in T.s$, and
the lhs is a finite sum of 0's.

For the rest of the subprobability measures, generated by the up and convex closure of $GT.s$, we show that the
minimum is maintained. Let $\Delta_0$ be the measure achieving the minimum expectation. If up-closure
added some $\Delta \sqsupseteq \Delta_0$, by monotonicity of the integral we would get a greater expectation $\int_\Delta \beta \geq \int_{\Delta_0} \beta$.
For an added convex combination $p\Delta_0 + (1 - p)\Delta_1$, it is easy to show that
$\int_{\Delta_0} \beta = p \int_{\Delta_0} \beta + (1 - p) \int_{\Delta_0} \beta \leq p \int_{\Delta_0} \beta + (1 - p) \int_{\Delta_1} \beta$, and the minimum is not changed. □



3.5 Fixed points 

The theory of fixed points is frequently used to give meaning to iterative programs like the ones we
outlined in Section 3.3. In this section we present a fixed point theory in accordance with our particular
domain.

A poset is a meet-semilattice if each pair of elements has a greatest lower bound (it has meet). An
almost complete meet-semilattice is a meet-semilattice where every non-empty subset has a greatest
lower bound (it has infimum). The poset $(\mathbb{R}_{\geq}, \leq)$ has this property by completeness of the real
numbers. Given a poset $(X, \sqsubseteq)$, a function $f : X \to X$ is monotone if $x \sqsubseteq y \Rightarrow f.x \sqsubseteq f.y$. An element
$x \in X$ is a pre-fixed point if $f.x \sqsubseteq x$, and if $f.x = x$ then it is a fixed point. The least fixed point is denoted
by $\mu.f$ if it exists.

It is easy to see that $\mathbb{E}S$ is an almost complete meet-semilattice because it is the pointwise extension
of $(\mathbb{R}_{\geq}, \leq)$ over the set of states $S$, so it inherits the property. Therefore, we will develop the basic
fixed point theory focused on these domains. The next result (proved in [6, Lemma 2.15]) establishes the
existence of the supremum of a subset of an almost complete meet-semilattice:

Lemma 3.3. Let $P$ be an almost complete meet-semilattice. Then the supremum $\sqcup S$ exists in $P$ for every
subset $S$ of $P$ which has an upper bound in $P$.

In general, an almost complete meet-semilattice is not necessarily a complete partial order (CPO).
For example, the poset $(\mathbb{R}_{\geq}, \leq)$ does not form a CPO (it lacks a top element) but it is an almost complete
meet-semilattice. Hence, we cannot prove the general existence of fixed points in this domain. However,
under certain conditions, existence of fixed points is guaranteed, as stated in [18, Theorem 1.4]:

Theorem 3.4. Let $(X, \sqsubseteq)$ be a poset and $f : X \to X$ a monotone function. Suppose $X$ is an almost complete
meet-semilattice which has a least element $\bot$ and $f$ has a pre-fixed point in $X$. Then $f$ has a least fixed
point, given by the infimum of the set of pre-fixed points of $f$: $\mu.f = (\sqcap\, x \mid f.x \sqsubseteq x \bullet x)$.

The next section shows a suitable notion of correctness for our programs, using this theorem. 



3.6 Correctness 

The wp semantics allows us to express quantitative properties as expectations. In order to verify these
properties, we present a notion of correctness based on the fact that any program $P = (S, \mathcal{I}, C)$ can
be written as a pGCL program. Such a program can be constructed as a while loop whose body is
a nested if-else ladder statement, as mentioned in Section 3.3. The guards of these conditional
statements are the predicates $A_I$ in (1), and their bodies are the nondeterministic probabilistic assignments
corresponding to each guard as defined in (2). The loop guard $G$ is the disjunction of the atoms, so it
exits the loop if there is no valid guard in the if-else ladder. The weakest pre-expectation semantics of
this program is defined in [20] as the least fixed point:

$wp.(\mathbf{do}\ G \to P'\ \mathbf{od}).\beta = (\mu X \bullet [G] \times wp.P'.X + [\neg G] \times \beta)$

where $G = (\bigvee I : \mathbb{P}^+[0..l) \bullet A_I)$ and $P'$ is the if-else ladder statement. Therefore, we can characterize the
following idea of correctness:

Definition 3.5. Let $P = (S, \mathcal{I}, C)$ be a program with $G_0, \ldots, G_{l-1}$ the guards of the commands in $C$, and
let $\alpha, \beta \in \mathbb{E}S$ be two expectations. Let also $G = (\bigvee i : [0..l) \bullet G_i)$ and $f : \mathbb{E}S \to \mathbb{E}S$ the expectation transformer
defined as:

$f.X = [G] \times wp.P.X + [\neg G] \times \beta$ .   (4)

Then, we say that $[\neg G] \times \beta$ is a valid post-expectation of $P$ with respect to the pre-expectation $\alpha$ if

$\alpha \Rrightarrow \mu.f$ .   (5)

Based on this definition, if we can find the fixed point $\varphi = \mu.f$ defined from (4), then we can show
that

$[G] \times \varphi \Rrightarrow wp.P.\varphi$
and $[\neg G] \times \varphi \Rrightarrow \beta$ ,

and these equations characterize $\varphi$ as an invariant of the system verifying the post-expectation $\beta$. Also, if it
satisfies (5), the initialization of the loop is also fulfilled.

Moreover, our definition of correctness requires the existence of a least fixed point $\mu.f$. Although the
poset $(\mathbb{E}S, \Rrightarrow)$ is an almost complete meet-semilattice, it is not a CPO because $\mathbb{R}_{\geq}$ itself is not (it lacks an
adjoined $\infty$ element). Therefore, we must remark that the existence of the least fixed point $\mu.f$ is guaranteed
only if $f$ has a pre-fixed point, as we require in Theorem 3.4. There are many cases where pre-fixed
points do not exist, even for plain pGCL programs. An example of this kind of program is $P'_0$, which is
obtained from Fig. 1 by simply replacing $i := i + 1$ with the (also linear) $i := 2 * i$. With this modification, it
can be calculated that $wp.P'_0.i = \infty$.



4 Abstract Semantics 
4.1 Preliminaries 

Following [19], we review the framework of abstract interpretation applied to our problem. Let us
consider two posets $\Gamma = (X, \sqsubseteq)$ and $\Gamma^\sharp = (X^\sharp, \sqsubseteq)$, and a monotone function $\gamma : X^\sharp \to X$ called the concretization
function. An element $x^\sharp \in X^\sharp$ is said to be a backward abstraction (b-abstraction) of $x \in X$ if $\gamma.x^\sharp \sqsubseteq x$.
The triple $(\Gamma, \Gamma^\sharp, \gamma)$ is called an abstraction, where $\Gamma$ is the concrete domain and $\Gamma^\sharp$ the abstract domain.

Definition 4.1. Let $f : X \to X$ be a monotone function. A function $f^\sharp : X^\sharp \to X^\sharp$ is a b-abstraction of $f$ if

$(\forall x : X,\ x^\sharp : X^\sharp \mid \gamma.x^\sharp \sqsubseteq x \bullet \gamma.(f^\sharp.x^\sharp) \sqsubseteq f.x)$ .

The next theorem, relevant to our work, shows that fixed points can be calculated accurately and uni- 
formly in an abstract domain. 
Theorem 4.2. Let $(\Gamma, \Gamma^\sharp, \gamma)$ be an abstraction where $\Gamma$ is an almost complete meet-semilattice with a
least element $\bot$ and $\Gamma^\sharp$ has a least element $\bot_\sharp$ with $\gamma.\bot_\sharp = \bot$. Let also $f : X \to X$ be a monotone
function and $f^\sharp : X^\sharp \to X^\sharp$ a b-abstraction of $f$. Then, if $f$ has a pre-fixed point in $X$:

1. the supremum of $\{i : \mathbb{N} \bullet \gamma.(f^{\sharp(i)}.\bot_\sharp)\}$ exists and is a lower bound of the least fixed point; that is,
$(\sqcup\, i : \mathbb{N} \bullet \gamma.(f^{\sharp(i)}.\bot_\sharp)) \sqsubseteq \mu.f$ ,

2. if $(\sqcup\, i : \mathbb{N} \bullet \gamma.(f^{\sharp(i)}.\bot_\sharp))$ is a pre-fixed point of $f$ then both definitions coincide; that is,
$(\sqcup\, i : \mathbb{N} \bullet \gamma.(f^{\sharp(i)}.\bot_\sharp)) = \mu.f$ .

Proof.

1. First, we prove that $\mu.f$ is an upper bound of the set $\{i : \mathbb{N} \bullet \gamma.(f^{\sharp(i)}.\bot_\sharp)\}$. By induction: since $\gamma.\bot_\sharp = \bot$,
we have $\gamma.\bot_\sharp \sqsubseteq \mu.f$. Suppose $\gamma.(f^{\sharp(i)}.\bot_\sharp) \sqsubseteq \mu.f$; then

$\gamma.(f^{\sharp(i+1)}.\bot_\sharp) = \gamma.(f^\sharp.(f^{\sharp(i)}.\bot_\sharp))$
$\sqsubseteq f.(\mu.f)$   { inductive hypothesis, Definition 4.1 }
$= \mu.f$   { $\mu.f$ is a fixed point of $f$ }

Hence, $\mu.f$ is an upper bound of that set. By Lemma 3.3 the supremum $(\sqcup\, i : \mathbb{N} \bullet \gamma.(f^{\sharp(i)}.\bot_\sharp))$
exists and is lower than $\mu.f$.

2. If $(\sqcup\, i : \mathbb{N} \bullet \gamma.(f^{\sharp(i)}.\bot_\sharp))$ is a pre-fixed point of $f$, by Theorem 3.4 $\mu.f \sqsubseteq (\sqcup\, i : \mathbb{N} \bullet \gamma.(f^{\sharp(i)}.\bot_\sharp))$, and
by item 1 these are equal.

□

The first part of this theorem shows that we can obtain a lower bound of the least fixed point over an
abstract domain. Then, given a program $P$ and an abstract domain, this lower bound can be used to prove
the correctness of the program (5). The second part gives us a sufficient condition to achieve exact fixed
point calculation in the abstract domain. We will go into these points in the next section.

4.2 Random Variable Abstraction 



By Theorem 4.2, if we choose an appropriate abstract domain then we could calculate or
under-approximate the fixed point $\mu.f$ (defined in Section 3.6) in order to prove the validity of (5). We
begin by defining a suitable abstraction $(\Gamma, \Gamma^\sharp, \gamma)$. The basic idea consists in generalizing predicate-based
abstraction theory (PA) [8, 4] to expectations. In PA the abstraction is determined by a set of $N$ linear
predicates $\{p_1, \ldots, p_N\}$. The abstract state space is just the lattice generated by the set of all bit-vectors
$(b_1, \ldots, b_N)$ of length $N$, whose concretization is defined as $\gamma.(b_1, \ldots, b_N) = (\bigwedge i : [1..N] \bullet b_i \equiv p_i)$. Then,
any predicate in the concrete domain can be represented as a series of truth values over each of these
atoms. Our RVA generalizes these values to linear functions on the program variables.
Given a program $P = (S, \mathcal{I}, C)$ over the set of variables $X = \{x_1, \ldots, x_n\}$ and a set of disjoint convex
linear predicates $\phi_1, \ldots, \phi_m$, the abstract domain will be $\Gamma^\sharp = (\mathbb{R}^{(n+1) \times m}, \sqsubseteq)$. It can be thought of as the set
of parameters of linear expectations over the program variables for each linear predicate $\phi_i$. Hence, the
concretization function is defined as:

$\gamma.((q^1_0, \ldots, q^1_n), \ldots, (q^m_0, \ldots, q^m_n)) \triangleq (\sum i : [1..m] \bullet (q^i_0 + (\sum j : [1..n] \bullet q^i_j \times x_j)) \times [\phi_i])$   (6)

and the order relation as $x^\sharp \sqsubseteq y^\sharp \equiv \gamma.x^\sharp \Rrightarrow \gamma.y^\sharp$. The predicates $\phi_1, \ldots, \phi_m$ can be constructed from the
guards of the program [8, 4], taking the complete boolean algebra [6] over this finite set of assertions.
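The concretization function (6) is straightforward to evaluate once the predicates are fixed. The sketch below assumes, for illustration only, that a state is a tuple $(x_1, \ldots, x_n)$, that each row of coefficients is stored as $(q_0, q_1, \ldots, q_n)$, and that predicates are Python functions on states; the name `gamma` mirrors the symbol in (6) but the encoding is ours.

```python
def gamma(coeffs, predicates):
    """Concretize an abstract element into an expectation (a function on states).

    coeffs[i] = (q0, q1, ..., qn) holds the linear function attached to
    predicates[i]; a state is a tuple (x1, ..., xn).
    """
    def expectation(state):
        total = 0
        for q, phi in zip(coeffs, predicates):
            if phi(state):  # characteristic function [phi_i]
                total += q[0] + sum(qj * xj for qj, xj in zip(q[1:], state))
        return total
    return expectation


# Geometric example of Section 2 with variables (x, i) and regions x = 0, x != 0.
predicates = [lambda s: s[0] == 0, lambda s: s[0] != 0]
fixed_point = [(0, 0, 1),   # region x = 0:  0 + 0*x + 1*i
               (2, 0, 1)]   # region x != 0: 2 + 0*x + 1*i
inv = gamma(fixed_point, predicates)

at_start = inv((1, 0))   # state after the initialization x, i := 1, 0
at_exit = inv((0, 5))    # a terminated state with i = 5
```

Because the predicates are disjoint, at most one summand contributes at any given state.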
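The next step is to define a b-abstraction of the transformer (4). First note that, as we want to approximate
$\mu.f$ by $(\sqcup\, i : \mathbb{N} \bullet \gamma.(f^{\sharp(i)}.\bot_\sharp))$ (Theorem 4.2), we only need to obtain the set $\{i : \mathbb{N} \bullet f^{\sharp(i)}.\bot_\sharp\}$ such that the
expectations in $\{i : \mathbb{N} \bullet \gamma.(f^{\sharp(i)}.\bot_\sharp)\}$ form an ascending chain in $\mathbb{E}S$. Thus, using Theorem 4.2 we can
approximate by calculation the least fixed point of the transformer as the limit of this chain. Moreover, the
function $f^\sharp : \mathbb{R}^{(n+1) \times m} \to \mathbb{R}^{(n+1) \times m}$ over elements of the defined set must agree with Definition 4.1,
that is

$(\forall x : \mathbb{E}S,\ i : \mathbb{N} \mid \gamma.(f^{\sharp(i)}.\bot_\sharp) \Rrightarrow x \bullet \gamma.(f^{\sharp(i+1)}.\bot_\sharp) \Rrightarrow f.x)$ .

If we suppose

$\gamma.(f^{\sharp(i+1)}.\bot_\sharp) \Rrightarrow f.(\gamma.(f^{\sharp(i)}.\bot_\sharp))$   (7)

then $f^\sharp$ agrees with the above definition:

$\gamma.(f^{\sharp(i)}.\bot_\sharp) \Rrightarrow x$
$\Rightarrow f.(\gamma.(f^{\sharp(i)}.\bot_\sharp)) \Rrightarrow f.x$   { $f$ monotone }
$\Rightarrow \gamma.(f^{\sharp(i+1)}.\bot_\sharp) \Rrightarrow f.(\gamma.(f^{\sharp(i)}.\bot_\sharp)) \ \wedge\ f.(\gamma.(f^{\sharp(i)}.\bot_\sharp)) \Rrightarrow f.x$   { by supposition (7) }
$\Rightarrow \gamma.(f^{\sharp(i+1)}.\bot_\sharp) \Rrightarrow f.x$   { by transitivity of $\Rrightarrow$ }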


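Taking into account (7) we construct the set $\{i : \mathbb{N} \bullet f^{\sharp(i)}.\bot_\sharp\}$, where the bottom element $\bot_\sharp$ is clearly
the null matrix in $\mathbb{R}^{(n+1) \times m}$. Also, if we have defined $f^{\sharp(i)}.\bot_\sharp$ we obtain $f^{\sharp(i+1)}.\bot_\sharp$ as follows: first
we calculate $f.(\gamma.(f^{\sharp(i)}.\bot_\sharp))$ and then we choose an expectation in the lhs of $\Rrightarrow$ that can be accurately
represented in $\Gamma^\sharp$. Let $f^{\sharp(i)}.\bot_\sharp = ((q^1_0, \ldots, q^1_n), \ldots, (q^m_0, \ldots, q^m_n))$ and $\beta^\sharp$ a b-abstraction of $\beta$. The former
can be obtained:

$f.(\gamma.(f^{\sharp(i)}.\bot_\sharp)) = [G] \times wp.P.(\sum i : [1..m] \bullet (q^i_0 + (\sum j : [1..n] \bullet q^i_j \times x_j)) \times [\phi_i])$
$\qquad + [\neg G] \times \gamma.\beta^\sharp$   { by (4) and (6) }

$= [G] \times (\sum I : \mathbb{P}^+[0..l) \bullet [A_I] \times (\sqcap\, r : I \bullet (\sum s : [1..k_r] \bullet p^r_s \times w^r_s)))$
$\qquad + [\neg G] \times \gamma.\beta^\sharp$   { by (2) }

where

$w^r_s = (\sum i : [1..m] \bullet (q^i_0 + (\sum j : [1..n] \bullet q^i_j \times x_j))\langle E^r_s \rangle \times [\phi_i \langle E^r_s \rangle])$ .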

Then, we restrict the above result to each domain defined by the predicates $\phi_i$, obtaining a set of
expectations. Although these expectations are not necessarily linear, if the program is linear (we have only
linear assignments $E^r_s$), each of them can be lower bounded by linear functions $g_i$ agreeing with (7). Also, we
choose each function $g_i$ such that it is greater than or equal to $[\phi_i] \times \gamma.(f^{\sharp(i)}.\bot_\sharp)$, ensuring monotonicity
of the constructed chain. So, each $g_i$ must be bounded above and below:

$[\phi_i] \times \gamma.(f^{\sharp(i)}.\bot_\sharp) \ \Rrightarrow\ [\phi_i] \times g_i \ \Rrightarrow\ [\phi_i] \times f.(\gamma.(f^{\sharp(i)}.\bot_\sharp))$ .

Note that it is always possible to find these bounded expectations $g_i$ because $f$ is monotone
(then $\gamma.(f^{\sharp(i)}.\bot_\sharp) \Rrightarrow f.(\gamma.(f^{\sharp(i)}.\bot_\sharp))$) and the lower bound $[\phi_i] \times \gamma.(f^{\sharp(i)}.\bot_\sharp)$ defines a hyperplane
over the set of states $\phi_i$. Also, as $g_1, \ldots, g_m$ are linear functions, we can find the parameters
$((q'^1_0, \ldots, q'^1_n), \ldots, (q'^m_0, \ldots, q'^m_n)) \in \mathbb{R}^{(n+1) \times m}$ such that each $g_i.(x_1, \ldots, x_n) = q'^i_0 + (\sum j : [1..n] \bullet q'^i_j \times x_j)$.
Then we define $f^{\sharp(i+1)}.\bot_\sharp \triangleq ((q'^1_0, \ldots, q'^1_n), \ldots, (q'^m_0, \ldots, q'^m_n))$. Thus, we abstract $f.(\gamma.(f^{\sharp(i)}.\bot_\sharp))$ by
choosing a lower expectation that can be accurately represented in $\Gamma^\sharp$.
Henceforth, by applying this method we can reach a fixed point $\varphi^\sharp$ in the abstract domain provided that
there exists a pre-fixed point of transformer $f$ (condition from Theorem 4.2). By Theorem 4.2 (1), the
expectation $\gamma.\varphi^\sharp$ is a lower bound of $\mu.f$ and can be used to prove the correctness of the program regarding
post-expectation $\beta$. Also, if we can check that $\gamma.\varphi^\sharp$ is a pre-fixed point of $f$ then, by Theorem 4.2 (2), it is the
least fixed point of $f$ and our abstraction works accurately.
The following section shows an example using this method.

5 Martingale Example 

In this section we give a more detailed example of our probabilistic program verification through random
variable abstraction, computing the average capital a gambler would expect from a typical martingale
betting system (Fig. 3). The gambler starts with a capital $C$ and initially bets one unit. While the bet $b$ is
lower than the remaining capital $c$, a fair coin is tossed; if it is a win, the player obtains a one hundred
percent gain and stops gambling, otherwise the player doubles the bet and continues. The expected gain of this
strategy is given by $wp.P_1.c$.

$P_1 \triangleq c, b := C, 1$
    $;\ \mathbf{do}\ 0 < b < c \to$
        $c := c - b$
        $;\ c, b := c + 2b, 0 \ \oplus_{1/2}\ b := 2b$
    $\mathbf{od}$

Figure 3: Martingale.



Following Section 4.2, we first define the set of variables $X$ and the set of disjoint convex linear
predicates $\phi_i$. For the variables we take the whole set $\{b, c\}$, since both participate in the control flow. The
predicates involved should, in principle, include the guard $0 < b < c$ and its negation, as is customary in
PA. However, the update operations in the program produce movements in the two-dimensional state space
that cannot be accurately captured by these two regions. We propose an abstraction that divides the
two-dimensional space $(b, c)$ into six different regions (predicates). The regions are defined by the four atoms
given by the inequalities $0 < b$, $b < c$, plus two other inequalities that define the six total orders between
$0$, $b$ and $c$. The concretization function $\gamma$ of the abstract domain $\mathbb{R}^{(2+1) \times 6}$ is the sum of a
linear function on $b, c$ in each region.

$\phi_1 = 0 < c < b \qquad \phi_4 = b < c < 0$
$\phi_2 = 0 < b < c \qquad \phi_5 = c < b < 0$
$\phi_3 = b < 0 < c \qquad \phi_6 = c < 0 < b$

$\gamma = \sum_{i=1}^{6} (c_i c + b_i b + a_i)[\phi_i]$
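To make the concretization concrete, here is a small Python sketch of $\gamma$. The names `PREDICATES` and `gamma` are hypothetical, and the six predicates are encoded with the strict inequalities exactly as listed above, so in this sketch points on a boundary between regions belong to no region and evaluate to 0.

```python
# Six region predicates phi_1..phi_6 over the state (c, b), as listed above.
PREDICATES = [
    lambda c, b: 0 < c < b,   # phi_1
    lambda c, b: 0 < b < c,   # phi_2
    lambda c, b: b < 0 < c,   # phi_3
    lambda c, b: b < c < 0,   # phi_4
    lambda c, b: c < b < 0,   # phi_5
    lambda c, b: c < 0 < b,   # phi_6
]


def gamma(coeffs):
    """Map six coefficient triples (c_i, b_i, a_i) to an expectation on (c, b)."""
    def expectation(c, b):
        # Sum of (c_i*c + b_i*b + a_i) * [phi_i]; at most one bracket is 1.
        return sum(ci * c + bi * b + ai
                   for (ci, bi, ai), phi in zip(coeffs, PREDICATES)
                   if phi(c, b))
    return expectation


identity_on_c = gamma([(1, 0, 0)] * 6)   # the expectation c away from boundaries
```

With all triples equal to $(1, 0, 0)$ the sketch returns $c$ in every region, which is the shape of the limit reached later in this section.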

If we disregard the initialization and coalesce the two assignments, the program can be transformed into
an equivalent $P_1'$ that follows the syntax of Section 3. It consists of just one guard:

$P_1' \triangleq 0 < b < c \rightarrow (c, b := c + b, 0)\,@\,\tfrac{1}{2} \mid (c, b := c - b, 2b)\,@\,\tfrac{1}{2}$ .

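The fixed point computation is $f.X = [G] \times wp.P_1'.X + [\neg G] \times \beta$ with post-expectation $\beta = c$. The
term $wp.P_1'.X$ is given by the wp semantics of the guarded command, and the other summand is given by the
final state condition on the post-expectation.

$f.X = [0 < b < c] \times \left(\tfrac{1}{2} X\langle c, b := c + b, 0\rangle + \tfrac{1}{2} X\langle c, b := c - b, 2b\rangle\right) + [\neg(0 < b < c)] \times c$ .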
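The transformer $f$ can be iterated directly on the concrete state space for a single initial state. The sketch below is only an illustration under stated assumptions: the names `f`, `bottom` and `kleene` are hypothetical, expectations are represented as Python functions of $(c, b)$, and the iteration starts from the constant-zero expectation.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def guard(c, b):
    # Guard G of the single guarded command, as printed: 0 < b < c.
    return 0 < b < c


def f(X):
    """One application of f.X = [G] * (1/2 X<win> + 1/2 X<lose>) + [not G] * c."""
    def fX(c, b):
        if guard(c, b):
            return HALF * X(c + b, 0) + HALF * X(c - b, 2 * b)
        return Fraction(c)
    return fX


def bottom(c, b):
    # Least expectation: constant zero.
    return Fraction(0)


def kleene(n):
    """n-th iterate of f starting from bottom."""
    X = bottom
    for _ in range(n):
        X = f(X)
    return X


# Iterates evaluated at the initial state (c, b) = (10, 1).
values = [kleene(n)(10, 1) for n in range(6)]
```

At the state $(10, 1)$ the iterates increase monotonically and stabilize at 10 after four steps, since at most three bets can be placed from that state.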
The function $f^\sharp$ is the b-abstraction of $f$ that we are going to calculate now. We apply $f$ to the
concretization function $\gamma$ over generic coefficients, and obtain three blocks of summands given by the left
assignment, the right assignment and the post-expectation.

$f.\gamma = \tfrac{1}{2}(c_3 c + c_3 b + a_3)[0 < c + b]$
$\quad + \tfrac{1}{2}\left((c_1 c + (2b_1 - c_1) b + a_1)[b < c < 3b] + (c_2 c + (2b_2 - c_2) b + a_2)[0 < 2b \wedge 3b < c]\right)$
$\quad + \left(\sum i : [1..6] \mid i \neq 2 \cdot (1c + 0b + 0)[\phi_i]\right)$ .

Clearly the first three summands do not fit into the abstraction given by the predicates $\phi_i$; however, we can
give linear lower bounds on each of them that can be accurately represented in the abstract domain, as
stated in Section 4.2. The region $0 < c + b$ includes $\phi_2$ or, in terms of random variables,
$[0 < b < c] \leq [0 < c + b]$, so we keep the linear function in the smaller region. The other two regions,
$\Psi_1 = b < c < 3b$ and $\Psi_2 = 0 < 3b < c$, form a partition of $\phi_2$. We need to find a plane that in $\Psi_1$ is
below $(c_1 c + (2b_1 - c_1) b + a_1)$ and in $\Psi_2$ is below $(c_2 c + (2b_2 - c_2) b + a_2)$. The two planes form a
wedge (Fig. 4), so our lower bounding plane would put a base on it. In the region $\Psi_1$ the function simplifies
to $c - b$ and the minimum in the region is attained when $c = b$. For $\Psi_2$ the minimum is $c_2 c$, achieved at
$b = 0$. Putting it all together, we obtain the final linear expression that bounds $f.\gamma$ from below and is in
terms of the original predicates:

$f.\gamma \geq \left(\tfrac{c_2 + 1}{2} c + \tfrac{1 - c_2}{2} b\right)[\phi_2] + \left(\sum i : [1..6] \mid i \neq 2 \cdot (1c + 0b + 0)[\phi_i]\right)$ .

Figure 4: Piecewise linear function to be minimized by a plane.
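The lower-bounding step can be sanity-checked numerically. The sketch below is an illustration only: the closed form $c_2 (c - b)$ for the bounding plane is an assumption inferred from the coefficients in Table 2, not a formula stated verbatim in the text, and the names `wedge`, `plane` and `plane_is_below` are hypothetical. It fixes $(c_1, b_1, a_1) = (1, 0, 0)$ and $a_2 = 0$ and tests the inequality on integer grid points of $\Psi_1$ and $\Psi_2$.

```python
from fractions import Fraction


def wedge(c, b, c2, b2):
    """Piecewise linear function from the right assignment, or None outside both regions."""
    if b < c < 3 * b:                        # Psi_1, with (c1, b1, a1) = (1, 0, 0)
        return c - b
    if 0 < 3 * b < c:                        # Psi_2, with a2 = 0
        return c2 * c + (2 * b2 - c2) * b
    return None


def plane(c, b, c2):
    # Assumed lower-bounding plane c2 * (c - b).
    return c2 * (c - b)


def plane_is_below(c2, b2, size=30):
    """Check plane <= wedge on all integer points 0 < b < c < size."""
    for c in range(1, size):
        for b in range(1, c):
            w = wedge(c, b, c2, b2)
            if w is not None and plane(c, b, c2) > w:
                return False
    return True
```

The check passes for the coefficient pairs that appear in Table 2 and fails once $c_2$ exceeds 1, which is consistent with the plane being a valid base for the wedge only while $0 \leq c_2 \leq 1$ and $b_2 \geq 0$.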

Note that by doing this we comply with Eq. 7, and this in turn implies that $f^\sharp$ is a b-abstraction. We
explicitly write down $f^\sharp$ and iterate it up to fixed point convergence in Table 2. As expected, $\phi_4, \phi_5, \phi_6$
do not contribute to the fixed point computation since $0 < c$ is a program invariant, so we do not include
them:

$f^\sharp.((c_1, b_1, a_1), (c_2, b_2, a_2), (c_3, b_3, a_3)) \triangleq \left((1, 0, 0), \left(\tfrac{c_2 + 1}{2}, \tfrac{1 - c_2}{2}, 0\right), (1, 0, 0)\right)$ .

We can see $f^{\sharp(i)}.\bot^\sharp \sqsubseteq f^{\sharp(i+1)}.\bot^\sharp$, so the under-approximation $\gamma.(f^{\sharp(i+1)}.\bot^\sharp)$ tends from below to
$\left(\sum i : [1..6] \cdot (1c + 0b + 0)[\phi_i]\right) = c$.
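The abstract iteration is easy to replay. The following Python sketch is an assumption-laden illustration: the update used for the $\phi_2$ triple, $((c_2 + 1)/2, (1 - c_2)/2, 0)$, is the form that reproduces the rows of Table 2, and the names `f_sharp`, `iterate` and `first_fixed_point` are hypothetical. It uses IEEE double precision floats on purpose.

```python
BOTTOM = ((0.0, 0.0, 0.0),) * 3   # coefficient triples for phi_1, phi_2, phi_3


def f_sharp(phi):
    """One abstract step; only c_2 influences the result."""
    (_c1, _b1, _a1), (c2, _b2, _a2), (_c3, _b3, _a3) = phi
    return ((1.0, 0.0, 0.0),
            ((c2 + 1.0) / 2.0, (1.0 - c2) / 2.0, 0.0),
            (1.0, 0.0, 0.0))


def iterate(n):
    """n-th iterate of f_sharp starting from BOTTOM."""
    phi = BOTTOM
    for _ in range(n):
        phi = f_sharp(phi)
    return phi


def first_fixed_point(limit=1000):
    """Smallest k with iterate(k) == iterate(k + 1), together with that iterate."""
    phi = BOTTOM
    for k in range(limit):
        nxt = f_sharp(phi)
        if nxt == phi:
            return k, phi
        phi = nxt
    return None
```

In exact arithmetic $c_2 = 1 - 2^{-n}$ only approaches 1 in the limit. In double precision the sum $c_2 + 1$ rounds up to 2 once $c_2$ is the largest double below 1, so the sketch stabilizes at iteration 55. This coincides with the last row of Table 2, which suggests, although the text does not say so, that the table was computed in floating point.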

The initialization $c, b := C, 1$ does the rest, yielding a lower bound on the expected remaining capital
that is precisely $C$. It can be proven that $c$ is a pre-fixed point of $f$ (in fact it is a fixed point), and using
Theorem 4.2 (2), we establish $\mu.f = c$; therefore our approximation is exact. Moreover, as the program
is purely probabilistic, $C$ is the expectation of $c$ with respect to $P_1$.
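The fixed point claim can be verified in one line. As a worked check, substituting $X = c$ into the definition of $f$ (with the guard $0 < b < c$ as printed) gives:

```latex
\begin{align*}
f.c &= [0 < b < c] \times \left(\tfrac{1}{2}(c + b) + \tfrac{1}{2}(c - b)\right) + [\neg(0 < b < c)] \times c \\
    &= [0 < b < c] \times c + [\neg(0 < b < c)] \times c \\
    &= c .
\end{align*}
```

Here $c\langle c, b := c + b, 0\rangle = c + b$ and $c\langle c, b := c - b, 2b\rangle = c - b$, so the two fair branches average back to $c$, which is the martingale property of the betting scheme.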
         Coefficients
Iter.    (c_1, b_1, a_1)    (c_2, b_2, a_2)              (c_3, b_3, a_3)
0        (0, 0, 0)          (0, 0, 0)                    (0, 0, 0)
1        (1, 0, 0)          (0.5, 0.5, 0)                (1, 0, 0)
2        (1, 0, 0)          (0.75, 0.25, 0)              (1, 0, 0)
3        (1, 0, 0)          (0.875, 0.125, 0)            (1, 0, 0)
4        (1, 0, 0)          (0.9375, 0.0625, 0)          (1, 0, 0)
5        (1, 0, 0)          (0.96875, 0.03125, 0)        (1, 0, 0)
6        (1, 0, 0)          (0.984375, 0.015625, 0)      (1, 0, 0)
7        (1, 0, 0)          (0.9921875, 0.0078125, 0)    (1, 0, 0)
55       (1, 0, 0)          (1, 0, 0)                    (1, 0, 0)

Table 2: Iteration for the martingale. 



6 Aims and conclusions 

This paper presents a technique for computing the expectation of unbounded random variables tailored
to performance measures, such as the expected number of rounds of a probabilistic algorithm or the expected
number of detections in anonymity protocols. Our method can check quantitative linear properties on any
denumerable state space using iterative backwards fixed point calculation.

Perhaps the main drawback of the method is that it is semi-computable, but it covers cases where
previous work cannot be applied (geometric distribution, martingales). Besides, it seems hard to bound
expectations of programs syntactically, since a minor (linear) modification in the geometric distribution
algorithm leads to an unbounded expectation for a program that halts with probability 1.

In future work we would like to build tool support for our approach. This would involve, among
other tasks, the mechanization of the weakest pre-expectation calculus in the abstract domain, as well
as of the maximization problem that involves computing a lower linear function in each iteration. As our
technique works on linear domains, this latter task could be solved by known linear programming
techniques.

We also plan to analyze more complex programs. The Crowds anonymity protocol modeled as in [11]
is a good candidate for our automatic quantitative program analysis, since its essential anonymity
properties are expressed as the expected number of times the message initiator is observed by the adversary
with respect to the observations obtained for the rest of the crowd. We also plan to reproduce the
results of Rabin and Lehmann's probabilistic dining philosophers algorithm [16].

Acknowledgements. We would like to thank Pedro D'Argenio for his support in this project, as well
as Joost-Pieter Katoen for handing us an early draft of [14].